75 research outputs found

Zero-Shot Learning by Convex Combination of Semantic Embeddings

Authors: Bengio Samy; Corrado Greg S.; Dean Jeffrey; Frome Andrea; Mikolov Tomas; Norouzi Mohammad; Shlens Jonathon; Singer Yoram
Publication date: 21/03/2014

Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the embedding space is trained jointly with the image transformation. In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage. Proponents of these image embedding systems have stressed their advantages over the traditional n-way classification framing of image understanding, particularly in terms of the promise for zero-shot learning, the ability to correctly annotate images of previously unseen object categories. In this paper, we propose a simple method for constructing an image embedding system from any existing n-way image classifier and a semantic word embedding model that contains the n class labels in its vocabulary. Our method maps images into the semantic embedding space via convex combination of the class label embedding vectors, and requires no additional training. We show that this simple and direct method confers many of the advantages associated with more complex image embedding schemes, and indeed outperforms state-of-the-art methods on the ImageNet zero-shot learning task.
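The mapping described in this abstract can be illustrated with a small sketch. It assumes the classifier returns a probability per training label and that each label has a word-embedding vector; the function names, the optional `top_t` truncation, the cosine-similarity lookup, and the toy two-dimensional vectors are all illustrative assumptions, not details taken from the abstract.

```python
import math


def convex_combination_embedding(probs, label_vectors, top_t=None):
    """Embed an image as a probability-weighted average of label vectors.

    probs: classifier probabilities over the n training labels.
    label_vectors: one embedding vector per training label, same order.
    top_t: optional number of most probable labels to keep (an assumption
           of this sketch; None keeps all labels).
    """
    if len(probs) != len(label_vectors):
        raise ValueError("probs and label_vectors must have the same length")
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_t is not None:
        order = order[:top_t]
    total = sum(probs[i] for i in order)
    if total <= 0:
        raise ValueError("selected probabilities must sum to a positive value")
    dim = len(label_vectors[0])
    embedding = [0.0] * dim
    for i in order:
        weight = probs[i] / total  # weights are non-negative and sum to 1
        for d in range(dim):
            embedding[d] += weight * label_vectors[i][d]
    return embedding


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_label(embedding, candidates):
    """Return the candidate label whose vector is most similar to embedding."""
    return max(candidates, key=lambda name: cosine(embedding, candidates[name]))


# Toy data (hypothetical): three seen labels, two unseen labels.
seen_vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]  # lion, tiger, car
unseen = {"liger": [0.9, 0.3], "truck": [0.1, 1.0]}
image_probs = [0.6, 0.3, 0.1]  # classifier output for one image

image_embedding = convex_combination_embedding(image_probs, seen_vectors)
print(nearest_label(image_embedding, unseen))
```

No parameters are fitted in this sketch: the classifier and the word vectors are used as given, which matches the abstract's statement that the method requires no additional training.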

arXiv.org e-Print Archive

CiteSeerX

Toward Online Measurement of Decision State

Authors: Corrado Greg S.; Johnston James C.; Lachter Joel; McClelland James L.
Publication date: 19/11/2009

In traditional perceptual decision-making experiments, two pieces of data are collected on each trial: response time and accuracy. But how confident were participants, and how did their decision state evolve over time? We asked participants to provide a continuous readout of their decision state by moving a cursor along a sliding scale between a 100% certain left response and a 100% certain right response. Subjects did not terminate the trials; rather, trials were timed out at random and subjects were scored based on the cursor position at that time. Higher rewards for correct responses and higher penalties for errors were associated with extreme responses, so that the response with the highest expected value was the one that accurately reflected the participant's odds of being correct. This procedure encourages participants to expose the time course of their evolving decision state. Evidence on how well they can do this will be presented.
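The incentive property described in this abstract, where the report with the highest expected value is the one matching the true odds, can be checked numerically. The abstract does not state the payoff function used, so this sketch assumes a quadratic scoring rule purely for illustration; the function names and the grid search are likewise hypothetical.

```python
def quadratic_score(report, right_is_correct):
    """Payoff for a cursor position under an assumed quadratic rule.

    report: cursor position in [0, 1], read as the stated probability
            that the right response is correct.
    """
    outcome = 1.0 if right_is_correct else 0.0
    return 1.0 - (report - outcome) ** 2


def expected_score(report, p_right):
    """Expected payoff when the true probability of 'right' is p_right."""
    return (p_right * quadratic_score(report, True)
            + (1.0 - p_right) * quadratic_score(report, False))


def best_report(p_right, steps=100):
    """Grid-search the cursor position with the highest expected payoff."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: expected_score(r, p_right))


# A participant who believes 'right' is correct with probability 0.7
# does best, in expectation, by placing the cursor at that belief.
print(best_report(0.7))
```

Under this assumed rule, an extreme report earns more when correct and less when wrong than a report near the middle of the scale, so overstating or understating confidence lowers the expected payoff.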

NASA Technical Reports Server

Building high-level features using large scale unsupervised learning

Authors: Chen Kai; Corrado Greg S.; Dean Jeff; Devin Matthieu; Le Quoc V.; Monga Rajat; Ng Andrew Y.; Ranzato Marc'Aurelio
Publication date: 01/01/2012

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state of the art.

arXiv.org e-Print Archive

CiteSeerX